[HUDI-3547] Introduce MaxwellSourcePostProcessor to extract data from Maxwell json string #4987
Hi @XuQianJin-Stars, can you help review this?
boolean isDelete = record.get(HoodieRecord.HOODIE_IS_DELETED).booleanValue();

assertFalse(isDelete);
assertNull(database);
Why is the database null?
Because the processor extracts the content of the data field as its result, and the database field does not belong to data.
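To illustrate the answer above, here is a minimal sketch of why the database field is absent after processing. It assumes a Maxwell-style envelope with top-level database, table, type, and data fields, modeled as a plain Map instead of parsed JSON. The class name, method names, and sample values are hypothetical and are not taken from this PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not the PR's implementation.
final class MaxwellDataExtractSketch {

  // Returns a copy of the nested "data" object. Envelope fields such as
  // "database" and "table" are not copied, so they are absent in the result.
  @SuppressWarnings("unchecked")
  static Map<String, Object> extractData(Map<String, Object> envelope) {
    Object data = envelope.get("data");
    if (!(data instanceof Map)) {
      // Assumption: a missing or malformed "data" field yields an empty record.
      return new LinkedHashMap<>();
    }
    return new LinkedHashMap<>((Map<String, Object>) data);
  }

  // Builds an illustrative envelope; all values here are made up.
  static Map<String, Object> sampleEnvelope() {
    Map<String, Object> data = new LinkedHashMap<>();
    data.put("id", 1);
    data.put("name", "andy");

    Map<String, Object> envelope = new LinkedHashMap<>();
    envelope.put("database", "hudi_db");
    envelope.put("table", "orders_01");
    envelope.put("type", "insert");
    envelope.put("data", data);
    return envelope;
  }
}
```

Calling extractData(sampleEnvelope()).get("database") returns null under these assumptions, which matches the assertNull(database) check in the test.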
There is a problem. If Maxwell is used to capture multiple tables in the same database at the same time, how can we distinguish the database and table if they are not carried into the record?
We can configure hoodie.deltastreamer.source.json.kafka.post.processor.maxwell.database.regex and hoodie.deltastreamer.source.json.kafka.post.processor.maxwell.table.regex to filter for the right database and table.
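As a rough sketch of how these two settings could drive the filtering, the example below reads both regexes from a Properties object and checks a record's database and table names against them. The two property keys are the ones named above. The class name, the method name, the use of a full match via Pattern.matches, and the choice to reject records when either regex is unset are all assumptions made for illustration.

```java
import java.util.Properties;
import java.util.regex.Pattern;

// Hypothetical sketch, not the PR's implementation.
final class MaxwellRegexFilterSketch {

  static final String DATABASE_REGEX_KEY =
      "hoodie.deltastreamer.source.json.kafka.post.processor.maxwell.database.regex";
  static final String TABLE_REGEX_KEY =
      "hoodie.deltastreamer.source.json.kafka.post.processor.maxwell.table.regex";

  // Returns true only when both names fully match their configured regex.
  // Assumption: an unset regex or a missing name means "do not accept".
  static boolean isTargetTable(Properties props, String database, String table) {
    String databaseRegex = props.getProperty(DATABASE_REGEX_KEY);
    String tableRegex = props.getProperty(TABLE_REGEX_KEY);
    if (databaseRegex == null || tableRegex == null || database == null || table == null) {
      return false;
    }
    return Pattern.matches(databaseRegex, database) && Pattern.matches(tableRegex, table);
  }
}
```

With the database regex set to hudi_db and the table regex set to orders_\d+ (both made-up values), a record from hudi_db.orders_01 would be accepted, while records from other_db.orders_01 or hudi_db.users would be dropped.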
The flow is: original data -> database and table regex filter -> extract data -> tag as delete or not (with some additional processing for deletes) -> return
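The flow described above can be sketched end to end as follows. This is a simplified illustration under several assumptions: records are modeled as Maps instead of JSON strings, the delete marker field is named _hoodie_is_deleted (the value assumed here for HoodieRecord.HOODIE_IS_DELETED), Maxwell marks deletions with type equal to delete, and the additional delete-specific processing mentioned in the thread is left out because it is not described here. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical sketch, not the PR's implementation.
final class MaxwellPipelineSketch {

  // Assumed value of HoodieRecord.HOODIE_IS_DELETED.
  static final String IS_DELETED = "_hoodie_is_deleted";

  @SuppressWarnings("unchecked")
  static Optional<Map<String, Object>> process(
      Map<String, Object> envelope, String databaseRegex, String tableRegex) {
    // Step 1: keep only records whose database and table match the regexes.
    Object database = envelope.get("database");
    Object table = envelope.get("table");
    if (!(database instanceof String) || !(table instanceof String)
        || !Pattern.matches(databaseRegex, (String) database)
        || !Pattern.matches(tableRegex, (String) table)) {
      return Optional.empty();
    }

    // Step 2: extract the nested "data" object as the result record.
    Object data = envelope.get("data");
    if (!(data instanceof Map)) {
      return Optional.empty();
    }
    Map<String, Object> result = new LinkedHashMap<>((Map<String, Object>) data);

    // Step 3: tag the record as deleted or not, based on the envelope type.
    result.put(IS_DELETED, "delete".equals(envelope.get("type")));

    // Step 4: return the processed record.
    return Optional.of(result);
  }
}
```

Filtering before extraction matters in this sketch because the database and table names live only on the envelope and are no longer available once the data field has been pulled out.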
This PR is very good overall.
// delete
} else if (DELETE.equals(type)) {
  // tag this record as delete.
  result.put(HoodieRecord.HOODIE_IS_DELETED, true);
Can the delete logic be put into a method?
Done.
+1 once CI succeeds.
… Maxwell json string (apache#4987) * [HUDI-3547] Introduce MaxwellSourcePostProcessor to extract data from Maxwell json string * add ut * Address comment
What is the purpose of the pull request
Introduce MaxwellSourcePostProcessor to extract data from a Maxwell JSON string.
Brief change log
Verify this pull request
This change can be verified by running org.apache.hudi.utilities.sources.TestJsonKafkaSourcePostProcessor#testMaxwellJsonKafkaSourcePostProcessor.
Committer checklist
Has a corresponding JIRA in PR title & commit
Commit message is descriptive of the change
CI is green
Necessary doc changes done or have another open PR
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.